metanetwork: A R package dedicated to handling and representing trophic metanetworks

Abstract Trophic networks describe interactions between species at a given location and time. Due to environmental changes, anthropogenic perturbations or sampling effects, trophic networks may vary in space and time. A collection of network time series or of networks sampled at different sites thus constitutes a metanetwork. We present here the R package metanetwork, which eases the representation, exploration and analysis of the increasingly available trophic metanetwork data sets. Our main methodological advance is a layout algorithm suited to trophic networks, based on trophic levels and on dimension reduction of a graph diffusion kernel. In particular, it highlights relevant features of trophic networks (trophic levels, energetic channels). In addition, we developed tools to handle, aggregate and compare those networks, both visually and quantitatively. Static and dynamic visualisation functions have been developed to represent large networks. We apply our package workflow to several trophic network data sets.

The main issue in trophic network representation remains providing a meaningful network layout related to ecological features, such as trophic levels or energetic channels (e.g. Elton, 1927; Van Leeuwen et al., 2015). Trophic networks are usually high dimensional with a complex structure, whereas a network layout is only a two-dimensional node embedding. Although network visualisation tools are now widely available (e.g. Bastian et al., 2009; Csardi & Nepusz, 2006; Pawluczuk & Iskrzyński, 2022; Perrone et al., 2020), current network layout methods that highlight the hierarchical structure of trophic networks remain scarce (but see Hudson et al., 2013).
They mainly rely on force-directed algorithms, such as that of Fruchterman and Reingold (1991), which is based on vertex repulsion, or those of Kamada and Kawai (1989) and Gansner et al. (2004), which rely on spring embedding. None of them incorporates ecological processes, so their outcomes on trophic networks are hard to interpret. Node layout algorithms specifically designed for trophic networks are still lacking.
Representing networks properly matters even more now that they are sampled in space and time (CaraDonna et al., 2017; Dunne, 2006), just as biogeography classically represents species in space (Lomolino et al., 2017; Von Humboldt & Bonpland, 1805).
Empirical evidence supports plasticity and stochasticity of interactions and would encourage sampling of trophic interactions through space and time (CaraDonna et al., 2017; de Aguiar et al., 2019; Poisot et al., 2015). However, sampling interactions in multiple sites is challenging since it requires joint observations of species. It is especially problematic when it involves organisms from different kingdoms and various body sizes (Jordano, 2016). Sampling taxa, using naturalist knowledge (Moser et al., 2005), camera traps (Steenweg et al., 2017) or environmental DNA (Bohmann et al., 2014), is far easier than sampling interactions. A convenient way to study networks in space is then to build a potential network at the regional scale, the metaweb, using expert knowledge or machine learning methods to complete interaction databases (Strydom et al., 2021). Once the metaweb is built, local networks are deduced using sampled abundances. Such an approach has been used for various organisms, from terrestrial vertebrates (Braga et al., 2019; Galiana et al., 2014) to marine or freshwater communities (Blackman et al., 2022; Kéfi et al., 2015; Kortsch et al., 2019) or soil communities (Bauer et al., 2022). Although they lose interaction plasticity and stochasticity, local networks nevertheless have distinct structures due to sampling effects.
Hereafter, a collection of networks in space or time is called a metanetwork, as a collection of communities is called metacommunity.
For simplicity, we refer to the potential interaction network as the 'metaweb'. While trophic network databases are becoming increasingly available (Poelen et al., 2014), tools to handle and represent them remain scarce. The present paper describes and implements a new layout algorithm built for trophic networks, using trophic levels and a diffusion-based algorithm. This contribution also describes several additional methods to handle, represent and analyse trophic metanetworks at different resolutions, as suggested in the literature (Guimarães, 2020; Thompson & Townsend, 2000). All the described methods are implemented in the R package metanetwork, which eases the manipulation and representation of trophic metanetworks. metanetwork is available on CRAN, and several vignettes on open data sets are accessible online at https://marcohlmann.github.io/metanetwork/.
We first describe inputs and methods to build and handle metanetwork objects. We then focus on the proposed 'TL-tsne' and 'group-TL-tsne' layout algorithms and the visualisation methods wrapped in metanetwork. We also illustrate the use of the package on several datasets of various dimensions, including marine, soil and vertebrate trophic networks.

| Package installation and documentation
The latest stable version is available on CRAN and can be installed using: install.packages("metanetwork"). Complete documentation along with several vignettes describing the examples of our paper is available here: https://marcohlmann.github.io/metanetwork/.

| Inputs of the 'metanetwork' object
To build a potential metanetwork (hereafter metanetwork), we need a metaweb, G*, that is, a directed and connected network including focal species and known potential trophic interactions in the study region. We can also include a community matrix P, indicating species relative abundances, and a trophic table, indicating which broader taxonomic or functional groups each species belongs to. Local networks are then the subnetworks of G* induced by local communities (with abundances).
Our package encodes a metanetwork as an R S3 object of class 'metanetwork'. The function build_metanet builds a 'metanetwork' object from the triplet formed by G*, P and the trophic table, and computes local networks. The metaweb G* must be of class 'igraph', 'matrix' or 'data.frame'. Unlike G*, the matrix P and the trophic table can be NULL.
In this case, the metanetwork will be a single network. Although the metaweb needs to be connected, local networks can be disconnected, which may occur due to sampling effects. Figure 1 provides a sketch representation of the package functionalities, and Table 1 describes the main functions and their associated ecological questions.
Local networks constitute a list of 'igraph' objects with relative abundances, edge weights and network names stored as node, edge and graph attributes.
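The way local networks are induced from the metaweb can be illustrated with a minimal Python sketch. It is not the code of build_metanet: the function name, the toy species and the convention that edges point from resource to consumer are assumptions made for this example only.

```python
def induced_local_networks(metaweb_edges, abundance_table):
    """Return one local edge list per site: the metaweb edges whose two
    endpoints both have a strictly positive abundance at that site."""
    local = {}
    for site, abundances in abundance_table.items():
        present = {sp for sp, ab in abundances.items() if ab > 0}
        local[site] = [(res, cons) for res, cons in metaweb_edges
                       if res in present and cons in present]
    return local


# Hypothetical toy metaweb (edges assumed to go from resource to consumer).
metaweb_edges = [("plant", "herbivore"), ("herbivore", "predator"),
                 ("detritus", "detritivore"), ("detritivore", "predator")]
# Hypothetical relative abundances at two sites (absent species are omitted).
abundance_table = {
    "site_A": {"plant": 0.5, "herbivore": 0.3, "predator": 0.2},
    "site_B": {"plant": 0.4, "detritus": 0.4, "detritivore": 0.2},
}
local_networks = induced_local_networks(metaweb_edges, abundance_table)
```

In this toy case, the plant at site_B has no sampled consumer, so the local network at that site is disconnected even though the metaweb is connected.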

| Append aggregated networks
In order to investigate trophic networks at different aggregation levels (e.g. broader taxonomic groups, functional groups or output of node clustering algorithms) as suggested in Thompson and Townsend (2000), Ohlmann et al. (2019) and Guimarães (2020), our package can compute aggregated networks using the trophic table, which describes aggregation levels (Figure 1). The nodes of the aggregated network are the broader groups given by the trophic table, whereas edges are aggregated based on the structure of the original network. Considering aggregated networks is particularly useful since node and edge composition can vary at the original resolution but be stable in the aggregated network. More formally, given a network G with n nodes, we can create Q groups from the original set of n nodes (Q < n) using the trophic table. We denote $C_1, \dots, C_Q$ the focal groups or aggregated nodes. Their relative abundances $(p_q)_{1 \le q \le Q}$ and interaction probabilities $(\tilde{\pi}_{ql})_{1 \le q, l \le Q}$ are computed according to Ohlmann et al. (2019) from $\pi_{kk'}$, the link probability between nodes k in group $C_q$ and nodes k′ in group $C_l$, and from $p_k$ and $p_{k'}$, their respective relative abundances.
The method append_agg_nets computes the abundances and the link probabilities at any aggregation level provided by the trophic table. It then appends aggregated networks with node and edge attributes to the current 'metanetwork' object (Figure 1).

FIGURE 1 Sketch representation of the use of the R package metanetwork from input data to output visualisation. It highlights the main functionalities of the package to handle and represent metanetworks.
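The aggregation step can be sketched in Python under one explicit assumption: group abundances are sums of member abundances, and group link probabilities are abundance-weighted averages of the node-level link probabilities. This is one formulation consistent with the quantities named above; it is not claimed to be the exact formula of Ohlmann et al. (2019) or the code of append_agg_nets, and all names are hypothetical.

```python
def aggregate_network(abundance, link_prob, group_of):
    """Aggregate node abundances and link probabilities by group.

    Assumed formulation: the abundance of a group is the sum of its members'
    abundances, and the link probability between groups q and l is
    sum(pi_kk' * p_k * p_k') / (p_q * p_l) over k in q and k' in l.
    Groups are assumed to have a strictly positive total abundance.
    """
    group_ab = {}
    for node, p in abundance.items():
        g = group_of[node]
        group_ab[g] = group_ab.get(g, 0.0) + p
    weighted = {}
    for (k, k2), pi in link_prob.items():
        key = (group_of[k], group_of[k2])
        weighted[key] = weighted.get(key, 0.0) + pi * abundance[k] * abundance[k2]
    group_link = {key: w / (group_ab[key[0]] * group_ab[key[1]])
                  for key, w in weighted.items()}
    return group_ab, group_link


# Hypothetical toy data: two resources in group G1, one consumer in group G2.
abundance = {"a": 0.25, "b": 0.25, "c": 0.5}
group_of = {"a": "G1", "b": "G1", "c": "G2"}
link_prob = {("a", "c"): 1.0}  # only "a" is eaten by "c"
group_ab, group_link = aggregate_network(abundance, link_prob, group_of)
```

With these numbers, half of the abundance of G1 interacts with G2, so the aggregated link keeps a probability of 0.5 rather than collapsing to a binary edge.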

| Representing and analysing metanetworks
We developed and implemented a new node layout algorithm specifically for trophic networks that we called 'TL-tsne', in reference to Trophic Levels and t-sne dimension reduction algorithm. Our 'TL-tsne' layout consists in a two-dimensional node embedding algorithm. It uses the trophic levels as the x-axis coordinates of the nodes in the two-dimensional space. The coordinates on the y-axis are computed using the diffusion kernel of the network (Kondor & Lafferty, 2002), which informs us on similarity between nodes according to a diffusion process, combined with a modified version of the 't-sne' algorithm, which allows reducing dimension (Van der Maaten & Hinton, 2008).
Moreover, we implemented functions to visualise and compare local networks, as well as metrics and indices to carry out quantitative analyses.

| Trophic levels computation
Trophic levels have been introduced to quantify the position in the hierarchy of resource acquisition (Lindeman, 1942). Despite various methods available to compute trophic levels (Hudson et al., 2013;Levine, 1980), we use the recent framework of MacKay et al. (2020), who define trophic level using the Laplacian matrix of the network because it embeds many useful properties of the network.
Let G be a directed network. We note A its adjacency matrix and D its diagonal degree matrix. The Laplacian matrix of the symmetrised version of G is defined by:

$$L = D - A - t(A)$$

where t(A) is the transpose of the adjacency matrix A. We note v = indegree(G) − outdegree(G) the imbalance vector. Then, the vector of the trophic levels, x, is the solution of the linear system:

$$L x = v \tag{3}$$

For a connected network, the solution x is unique up to a translation. Thus, we always fix its minimal entry to 0 (corresponding to basal species) and get the trophic levels of all the other species (more details in Appendix S1). In our package, we first compute the trophic levels from the metaweb G*: because this graph is connected, we can fix the minimal trophic level to 0 and provide a trophic level for all other species. Since local networks might be disconnected due, for instance, to sampling effects, we compute the trophic levels in each connected component of the local network, and we fix the minimal trophic level in each component to its trophic level in the metaweb graph (see Appendix S1 for more details).
The method compute_TL computes trophic levels and stores them as node attributes of the networks belonging to the current 'metanetwork' object. These trophic levels are the x-axis coordinates of our node layout.
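The linear system defining trophic levels can be solved by hand on small examples. The Python sketch below is not the implementation of compute_TL. It assumes a weakly connected network given as an edge list on nodes 0..n-1, with edges pointing from resource to consumer and unit weights. It pins one node to 0 to remove the translation freedom, solves the reduced system by Gaussian elimination, and then shifts the solution so that its minimum is 0.

```python
def trophic_levels(n, edges):
    """Solve L x = v with L = D - A - t(A) and v = indegree - outdegree,
    then shift x so that its smallest entry is 0."""
    lap = [[0.0] * n for _ in range(n)]
    v = [0.0] * n
    for i, j in edges:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
        v[j] += 1.0  # one more incoming edge for the consumer
        v[i] -= 1.0  # one more outgoing edge for the resource
    # L is singular (constant vectors are in its kernel): pin x[0] = 0 and
    # solve the reduced (n-1) x (n-1) system with partial pivoting.
    m = n - 1
    aug = [lap[r + 1][1:] + [v[r + 1]] for r in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(m - 1, -1, -1):
        s = aug[r][m] - sum(aug[r][c] * x[c + 1] for c in range(r + 1, m))
        x[r + 1] = s / aug[r][r]
    lowest = min(x)
    return [xi - lowest for xi in x]


# A three-species chain, then the same chain with an omnivorous top species.
chain_levels = trophic_levels(3, [(0, 1), (1, 2)])
omnivore_levels = trophic_levels(3, [(0, 1), (1, 2), (0, 2)])
```

On the chain, the levels are 0, 1 and 2. Adding the omnivorous link from species 0 to species 2 gives non-integer levels (0, 2/3 and 4/3), which shows that the solution balances all edges rather than counting the longest path.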

| Diffusion graph kernel and 'TL-tsne' layout algorithm
The core of our new layout method is the use of a diffusion graph kernel. Graph kernels are similarity matrices between the nodes of a network, based on its structural characteristics. The diffusion graph kernel computes similarity between nodes based on a diffusion process, thus capturing the path structure of the network (Kondor & Lafferty, 2002; Smola & Kondor, 2003). It is particularly suitable for our new network layout since it clusters together nodes that are involved in similar paths. From the network G, we define the diffusion graph kernel K:

$$K = \exp(-\beta L) = \sum_{k=0}^{\infty} \frac{(-\beta L)^k}{k!}$$

where L is the Laplacian matrix of G and β is the diffusion parameter, a strictly positive scalar. The last part of the equation corresponds to the power series expansion of the matrix exponential, with k going from 0 to infinity. In our package, the diffusion kernel is computed through its eigenvalues (see Appendix S1). In the context of trophic networks, the diffusion process described by K might represent the diffusion of organic matter through the network. Nodes involved in the same paths (whatever their lengths) will therefore have a high similarity. Increasing the diffusion parameter β will increase similarity values between nodes involved in the same paths while decreasing similarities between nodes involved in different paths. In order to compute the y-axis coordinates of the nodes in our layout of the network G, we need to reduce the information provided by the diffusion kernel K (whose dimension is the number of nodes of G). We use a dimension reduction algorithm adapted from the t-sne algorithm (Van der Maaten & Hinton, 2008), which provides a low-dimensional embedding of high-dimensional data while preserving neighbourhoods.
The t-sne method relies on an iterative algorithm, which minimises the Kullback-Leibler divergence between similarity matrices in the high- and low-dimensional spaces.
We use the diffusion kernel K to measure the similarity in the high-dimensional space (that is, the set of neighbours in our network, which is fixed). We use the same low-dimensional similarity as in Van der Maaten and Hinton (2008). The x-axis coordinate is already fixed here to the trophic levels, while the second coordinate is chosen such that the Kullback-Leibler divergence between the two similarity matrices is minimal. Importantly, the minimisation procedure accounts for trophic levels. We named the proposed network layout algorithm 'TL-tsne' (see Algo. S1 in Appendix S1). We also provide a method to evaluate the quality of the computed layout and to select the β value using a modified version of the Moran index (De Jong et al., 1984, see Appendix S1).
The method attach_layout computes the 'TL-tsne' layout and stores it as a node attribute of the focal network.
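To make the diffusion kernel concrete, the following Python sketch approximates the matrix exponential of −βL by truncating its power series. This differs from the package, which computes the kernel through its eigenvalues. The function names, the number of series terms and the toy network are assumptions made for illustration, and the truncation is only adequate for small networks and moderate β.

```python
def symmetrised_laplacian(n, edges):
    """L = D - A - t(A) for a directed edge list on nodes 0..n-1."""
    lap = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    return lap


def mat_mul(a, b):
    """Plain product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(lap, beta, n_terms=40):
    """Approximate K = exp(-beta * L) with the first n_terms series terms."""
    n = len(lap)
    scaled = [[-beta * lap[i][j] for j in range(n)] for i in range(n)]
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    kernel = [row[:] for row in term]
    for k in range(1, n_terms):
        product = mat_mul(term, scaled)
        term = [[entry / k for entry in row] for row in product]  # (-beta L)^k / k!
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel


# Toy web with two channels merging at a top predator:
# 0 plant -> 1 herbivore -> 4 predator, 2 detritus -> 3 detritivore -> 4 predator
edges = [(0, 1), (1, 4), (2, 3), (3, 4)]
K = diffusion_kernel(symmetrised_laplacian(5, edges), beta=0.5)
```

In this toy web, the plant is more similar to the herbivore of its own channel than to the detritivore of the other channel (K[0][1] > K[0][3]), which is the property the layout relies on to separate channels along the y-axis.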

| Visualisation
Besides proposing a new layout method, the metanetwork package allows incorporating these layouts in two recent R packages dedicated to network visualisation: 'ggnet' and 'visNetwork'. The

| Representing the difference between networks
In order to ease local network comparisons, metanetwork implements a function diff_plot that highlights differences and similarities between two networks. More precisely, let $G_1$ and $G_2$ be two local networks (with vertex sets $V_1$ and $V_2$). We note $G_{diff}$ the difference network between $G_1$ and $G_2$, whose vertex set is $V_{diff} = V_1 \cup V_2$. It is the subgraph of the metaweb G* induced by $V_{diff}$. We then assign node abundances and edge weights to $G_{diff}$. The node abundances of the difference network are the differences between the node abundances of $G_1$ and $G_2$, and edge weights are defined in the same way. We use a colour code to distinguish nodes that are present in both networks with different abundances from nodes that are absent from one of the networks. A colour code in the visualisations also indicates the sign of the node abundance difference and of the edge weight difference between networks (see Figure 4 in Section 3.1).
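The construction of the difference network can be sketched in Python as follows. This is not the code of diff_plot: the function name, the status labels and the toy abundances are hypothetical, and edge weight differences are left out because the definition of local edge weights is not given here.

```python
def difference_network(metaweb_edges, ab1, ab2):
    """Nodes, abundance differences (network 1 minus network 2) and edges of
    the metaweb subgraph induced by the union of the two vertex sets."""
    v1 = {sp for sp, ab in ab1.items() if ab > 0}
    v2 = {sp for sp, ab in ab2.items() if ab > 0}
    v_diff = v1 | v2
    node_diff = {sp: ab1.get(sp, 0.0) - ab2.get(sp, 0.0) for sp in v_diff}
    status = {}
    for sp in v_diff:
        if sp in v1 and sp in v2:
            status[sp] = "shared"
        elif sp in v1:
            status[sp] = "only_in_1"
        else:
            status[sp] = "only_in_2"
    edges = [(a, b) for a, b in metaweb_edges if a in v_diff and b in v_diff]
    return node_diff, status, edges


# Hypothetical toy example: a predator appears in the second network only.
metaweb_edges = [("plant", "herbivore"), ("herbivore", "predator")]
ab1 = {"plant": 0.5, "herbivore": 0.5}
ab2 = {"plant": 0.25, "herbivore": 0.25, "predator": 0.5}
node_diff, status, diff_edges = difference_network(metaweb_edges, ab1, ab2)
```

The sign of each entry of node_diff and the status label carry the two pieces of information that the colour code of the visualisation conveys.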

| Representing large networks with 'group-TL-tsne' layout
In order to represent networks with a large number of nodes (typically more than 100), we propose a variation of the 'TL-tsne' layout that uses information from trophicTable. This specific layout method, called 'group-TL-tsne', uses the 'TL-tsne' layout at a desired aggregation level and combines it with the 'igraph' layout_with_graphopt layout. We first compute the coordinates at the desired group resolution using the 'TL-tsne' algorithm. We then compute, in each group, the coordinates of the nodes using the 'igraph' layout centred at the coordinates of the group. A configuration object allows adjusting group diameters. The attach_layout method computes the 'group-TL-tsne' layout and stores it as a node attribute. Computing the 'group-TL-tsne' layout is more computationally efficient since the 'TL-tsne' layout is computed on the aggregated network only, which is much smaller.
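The two-step idea behind the group layout can be sketched in Python. This is a deliberately simplified stand-in: the within-group placement uses points evenly spaced on a circle instead of the 'igraph' layout_with_graphopt layout, the group coordinates are hypothetical numbers rather than an actual 'TL-tsne' output, and the radius argument only mimics the role of the group diameter setting.

```python
import math


def group_layout(group_coords, group_of, radius=0.1):
    """Place each node on a circle of the given radius centred on the
    coordinates of its group; a single-node group sits at the centre."""
    members = {}
    for node, g in group_of.items():
        members.setdefault(g, []).append(node)
    coords = {}
    for g, nodes in members.items():
        cx, cy = group_coords[g]
        nodes = sorted(nodes)
        if len(nodes) == 1:
            coords[nodes[0]] = (cx, cy)
            continue
        for idx, node in enumerate(nodes):
            angle = 2.0 * math.pi * idx / len(nodes)
            coords[node] = (cx + radius * math.cos(angle),
                            cy + radius * math.sin(angle))
    return coords


# Hypothetical group coordinates (x: trophic level, y: second layout axis).
group_coords = {"rodents": (0.0, -1.0), "snakes": (1.5, 0.5)}
group_of = {"r1": "rodents", "r2": "rodents", "r3": "rodents", "s1": "snakes"}
node_coords = group_layout(group_coords, group_of, radius=0.2)
```

Only the group-level coordinates require the costly layout computation; placing nodes inside each group is cheap, which is why this strategy scales to large networks.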

| Computing network metrics, diversities and dissimilarities
In order to quantitatively assess network structure and compare local networks at the different resolutions, our package implements functions to compute network metrics (compute_metrics), network diversity (compute_div) and pairwise dissimilarity (compute_dis) indices (see Figure 1). compute_metrics computes the mean and maximum trophic levels of the metaweb and local networks (using the output of compute_TL()). This function also computes the mean shortest path length (using the 'igraph' mean_distance() function).
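The metrics named above can be illustrated with a short Python sketch. It is not the code of compute_metrics: the function names are hypothetical, and averaging directed shortest paths over reachable ordered pairs only is an assumption about how the mean distance is taken, since the options passed to mean_distance() are not stated here.

```python
from collections import deque


def mean_shortest_path(n, edges):
    """Mean length of directed shortest paths over all ordered pairs of
    distinct nodes that are connected by a path (breadth-first search)."""
    succ = [[] for _ in range(n)]
    for i, j in edges:
        succ[i].append(j)
    total, n_pairs = 0, 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in succ[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        n_pairs += len(dist) - 1
    return total / n_pairs if n_pairs else float("nan")


def trophic_level_summary(levels):
    """Mean and maximum of a list of trophic levels."""
    return {"mean_TL": sum(levels) / len(levels), "max_TL": max(levels)}


# Three-species chain 0 -> 1 -> 2 with trophic levels 0, 1 and 2.
chain_mean_distance = mean_shortest_path(3, [(0, 1), (1, 2)])
chain_summary = trophic_level_summary([0.0, 1.0, 2.0])
```

For the chain, the three reachable pairs have lengths 1, 1 and 2, so the mean shortest path length is 4/3.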
The package also implements the function compute_div that com-

| CASE STUDIES
In this section, we apply metanetwork functions to three real-world metawebs, which correspond to different ecosystems with various organisms. In the main text, we use the static representation of the networks using ggmetanet while we provide interactive fr/app/ecolo gical -networks.

| Angola coastal network
We first look at a dataset from Angola, which has been extracted from Web of Life (http://www.web-of-life.es). It consists in a coastal trophic metaweb of 28 nodes (species or groups) and 127 interactions sampled along Angola's coastline (Angelini & Vaz-Velho, 2011 cording to the relative frequency of prey species in the diet of each predator species. We represented the metaweb using ggmetanet with the 'TL-tsne' layout (β = 0.02) in Figure 3. The metaweb has two basal nodes, 'Phytoplankton' and 'Detritus', leading to a primary producer channel and a detritus channel that mix higher up in the network. We included the Angola dataset as an example in the package (meta_angola object), with abundances built from biomasses in 1986 and 2003. We also represented the difference network between the two dates using the diff_plot function with the 'TL-tsne' layout (β = 0.05) in Figure 4. We also computed a profile of the extended Moran index along β values to select the optimal β (see Figures S2 and S3).

| Norway soil trophic network
Norway soil trophic network dataset was extracted from Calderón- corresponds to the green energy channel while the channel starting from organic materials is the brown channel (Moore et al., 2004;Mougi, 2020;Polis & Strong, 1996). Importantly, we observe from our network representation that bacterial and fungal paths are separated in the brown channel. It means that they are linked to separated paths higher up in the network (e.g. bacterivore and fungivore groups).

Calderón-Sanou et al. (2021) document the impact of a disturbance (moth outbreaks) on soil diversity. We provide the difference network between pre- and post-disturbance (Figure S4). It highlights a shift from Ectomycorrhizae and Ericoid mycorrhizae towards Arbuscular mycorrhizae and also an increase in soil predator abundances.

| Metaweb of European tetrapods
The metaweb of European tetrapods was extracted from Maiorano Model (SBM) that clusters nodes with similar connectivity patterns (Daudin et al., 2008). We represented the metaweb using the 'TL-tsne' layout (β = 3 × 10⁻⁶) while flipping x and y coordinates (see flip_coords option). We mapped the 46 SBM groups using a combination of colours and shapes (see Figure S5). To get a more ordered representation, we used the 'group-TL-tsne' layout, which uses the 'TL-tsne' layout at the SBM group resolution (Figure 6, Figure S6).
Interestingly, some SBM groups overlap in the 'group-TL-tsne' layout. For basal species, the group containing many rodents of genus Microtus (purple squares) is mixed with the group containing many rodents of genus Spermophilus (pink squares). Higher up in the network, the group containing predator snakes of genera Hierophis and Montivipera (pink diamonds) overlaps with the group containing snakes of genera Vipera and Hemorrhois (purple diamonds).

| DISCUSSION
We have presented metanetwork, an R package dedicated to handling and representing trophic metanetworks. These metanetworks are built from a metaweb, an abundance table and an optional information table on nodes. Potential local networks are then deduced from the metaweb and local abundances. While losing the local plasticity of interactions, such an approach generates distinct local networks due to sampling effects. Recent studies aimed at unravelling the structure of local networks for different types of communities (Bauer et al., 2022; Kéfi et al., 2015; Kortsch et al., 2019).

FIGURE 4 Difference network between the Angola network from 1986 and from 2003. Differences in node abundances are given by differences in estimated biomasses at the two time steps. We use the diff_plot function with the computed 'TL-tsne' metaweb layout (see layout_metaweb option) to visualise the difference network.
The purpose of the metanetwork R package is to provide representation tools for trophic networks and metanetworks. Representing networks consists in choosing an appropriate node layout algorithm and a suitable visualisation technique (Pocock et al., 2016). While the visualisation techniques wrapped in 'metanetwork' were already widely available, a network layout algorithm specifically designed for trophic networks was sorely lacking. We developed the 'TL-tsne' network layout algorithm, which constitutes the main methodological development of the present paper. This new layout combines the computation of trophic levels, using the Laplacian matrix, on the x-axis with a non-linear dimension reduction of the graph diffusion kernel on the y-axis. Besides representing two different features, it allows reading the network along fixed axes. Our diffusion kernel method relies not only on edges, which correspond to paths of length 1, but also on paths of arbitrary length. As a result, our layout is less sensitive to the deletion of an edge or, more generally, to mistakes in edge specification than force-based layouts, which are very sensitive to them, as pointed out in Pocock et al. (2016). Moreover, paths of arbitrary length do have ecological interpretations in terms of energetic channels in the network. Notice that the proposed 'TL-tsne' layout uses the diffusion kernel of an undirected version of the considered network on the y-axis, given the x-axis, which takes into account the directionality of the network since an imbalance term is present in Equation (3). The present method is therefore designed for directed networks only. Diffusion maps achieve a similar goal for the embedding of points in space, relying however on an undirected graph built from spatial coordinates (Coifman et al., 2005).
We also notice the proximity of our method with node embedding algorithms using neural networks since they provide low dimension representation of networks using paths, as the proposed method (Khosla et al., 2019;Narayanan et al., 2017).
But, beyond technical concerns, 'TL-tsne' layout algorithm is suitable for trophic networks since it allows reading and interpreting the network along fixed axes contrary to traditional force-based layouts. These axes have an ecological interpretation involving energy diffusion in the network. More precisely, the first axis, the trophic levels, describes the hierarchy in the acquisition of resources.
Although this scalar quantity is not enough to summarise the network, as pointed out by the critics of this concept (Cousins, 1987), it is in line with a thermodynamic interpretation of trophic networks (Lindeman, 1942; Thompson et al., 2012). Using the trophic level as the first axis to represent trophic networks is almost consensual, as in the function PlotWebByLevel() from the 'cheddar' R package (Hudson et al., 2013) or in Potapov (2022). Contrary to 'cheddar', where the second axis is implicit, or to Potapov (2022), where it is a species trait, the second axis in our layout represents explicit complementary information that is related to the diffusion of energy along the network and that can be computed without additional information about species. In our 'TL-tsne' layout, two species with similar trophic levels may have different y-axis values, which indicates that they belong to different energetic channels. Such a pattern is illustrated in the Angola coastal network and the Norway soil network, where the 'TL-tsne' layout highlights two distinct channels in both networks: the green channel, linked to primary producers (either phytoplankton or plants), and the brown channel, linked to detritus (Moore et al., 2004; Mougi, 2020; Polis & Strong, 1996). To our knowledge, this is the first network layout algorithm that highlights these channels on empirical trophic network data. This sheds new light on a common structure shared by coastal and terrestrial communities, as previously suggested in the literature (Bramon Mora et al., 2018). Moreover, the diffusion parameter β allows accentuating the separation between these different channels, as shown in Figure 2. Although the parameter β can be optimised numerically using the extended Moran index, we encourage users of metanetwork to explore several β configurations in order to represent channels gradually separated from each other.

FIGURE 5 Norway soil trophic network, with 40 nodes and 204 edges. Nodes are coloured according to taxonomic groups and have relative abundances built from environmental DNA data. It is represented using 'TL-tsne' layout (β = 0.006) and ggmetanet visualisation.
As a conclusion, our layout method based on diffusion processes, which highlights ecological processes such as organic matter diffu-

It is represented using 'group-TL-tsne' layout, built from group layout ('TL-tsne' with β = 0.005) and ggmetanet visualisation. In this representation, the y-axis is the trophic level. The legend is constructed by taking the silhouette of a representative of each group on http://phylopic.org/. See Table S1 for credits.

ACKNOWLEDGMENTS
We thank Irene Calderon-Sanou for data curation and insightful discussions on the Norway metanetwork data set. This research was funded by the French Agence Nationale de la Recherche (ANR) through the GlobNet (ANR-16-CE02-0009) and EcoNet (ANR-18-CE02-0010) projects.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.